Complex Certainty Factors for Rule Based Systems - Detecting Inconsistent Argumentations

Author

  • Taïeb Mellouli
Abstract

This paper discusses the anomaly of gradually inconsistent argumentations when reasoning under uncertainty. It is argued that in several domains, uncertain knowledge modeling experts' opinions may induce inconsistencies to a certain degree in interim and final conclusions. In order to model gradual/partial inconsistency, complex certainty factors are introduced and their serial and parallel propagation within rule-based expert systems is presented. Our complex certainty factor model, representing and propagating belief and disbelief separately, sheds light on the meaning of inconsistency degrees and their persistence within argumentations under uncertainty. For the methodology capable of this separate propagation, complex certainty factors are designed as two-dimensional value tuples for facts and four-dimensional value tuples for rules. Requiring local consistency of knowledge, we show that only two dimensions are necessary for rules and, based on this finding, deliver a simple graphical visualization suitable for expert knowledge acquisition. Finally, we categorize gradual inconsistencies and discuss their handling.

1 Motivation: Rules, uncertainty handling, and inconsistency

Experts can be assisted in their decisions and actions by modeling the environment as a knowledge base and asking for entailed consequences in the form of derivations of the expert's stated goals from the knowledge base. A widely used form of knowledge representation consists of facts (data) and if-then rules (production rules) for expert systems or rule-based systems. Efficient basic algorithms are known which act either in a forward-chaining, data-driven, bottom-up manner by applying rules from facts over derived interim results to goals (production view), or in a backward-chaining, goal-driven, top-down manner reducing derivations of goals to those of subgoals until reaching facts (goal/problem reduction). In the case of certain knowledge, a goal may admit several derivations using a collection of facts and rules, and it is known that a single derivation suffices to show entailment from a consistent (Horn) knowledge base.

In the case of uncertain knowledge, the methodology of rule-based systems, logic, and logic programming cannot be transferred in a straightforward manner. In their seminal work on modeling inexact reasoning in medicine, Shortliffe and Buchanan [11] propose the use of certainty factors (CF), real numbers between -1 and 1, for facts and rules, expressing measures of increased belief (positive CF) or disbelief (negative CF) according to acquired evidence, and describe within their MYCIN diagnosis system the propagation of certainty factors for derived interim and final conclusions/goals within a forward-chaining inference framework. Besides calculating CFs for logical expressions of rule conditions and propagating CFs in rule application (serial propagation), a new issue occurs whenever several derivations exist for the same conclusion/goal, such as for the same hypothesis in medical diagnosis. Whereas such a situation is not very interesting for certain knowledge (one simply takes one of the derivations/argumentations as a proof of the goal with certainty), two derivations for the same hypothesis, each with an uncertain belief measure based on different pieces of evidence, are regarded as constituting a situation of incrementally acquired evidence for the same hypothesis and lead to a stronger belief in that hypothesis (parallel propagation).
This parallel propagation can not only be applied to two measures of increased belief and, similarly, to two measures of increased disbelief, but also to mixed situations where a measure of increased belief (positive CF) and a measure of increased disbelief (negative CF) have previously been calculated for the same hypothesis or subgoal. This situation leads to a positive CF if belief is of higher degree, to a negative CF if disbelief is of higher degree, and to zero if the measures of belief and disbelief are equal. The two versions of the MYCIN formulas for parallel propagation do not apply to combining certain belief (+1) with certain disbelief (-1), the case of absolute inconsistency.

This paper recognizes a deficiency in the latter kind of calculation from a modeling point of view when reasoning with experts' opinions and rules that can lead to (degrees of) contradictions due to (partially) inconsistent argumentations and derivations for goals and subgoals. We introduce complex certainty factors to manage these contradicting opinions, leading to combined measures of increased belief and disbelief. Calculations with complex certainty factors make it possible to recognize conflicting subresults and to propagate degrees of inconsistency up to final goals and conclusions. In our opinion, the idea and visualization of the proposed complex certainty factors shed light on the problem of gradual inconsistency within uncertainty reasoning. The author is aware that, starting with the works of Heckerman and Horvitz [6] and Pearl [10], in which several anomalies of "extensional approaches" like the CF model for uncertainty reasoning are discussed and in which belief networks, as an "intensional approach" based on Bayesian probabilistic inference, are declared to constitute a superior model for reasoning with uncertainty, a considerable part of the AI community has followed this opinion, including the developers of MYCIN themselves (Heckerman and Shortliffe [7]). Extensional approaches, viewed as suffering from modularity together with locality and detachment, "respond only to the magnitudes of weights and not to their origins" [10] and therefore lack a proper handling of distant correlated evidence. However, the problem of partial/gradual inconsistency addressed in this paper describes another type of anomaly of reasoning with uncertainty, and we are not aware of a resolution of this anomaly in non- or quasi-probabilistic (extensional) or probabilistic (intensional) systems, including belief networks.

In Sect. 2, we emphasize the relevance of the inconsistency anomaly by considering some business applications where expert knowledge can lead to inconsistencies. In Sect. 3, we review the MYCIN certainty factor model and discuss some general interpretation issues, such as properties of degrees of confirmation and disconfirmation. We define the notion of local belief consistency and distinguish absolute and uncertain belief/disbelief as well as absolute and partial inconsistency. The anomaly of gradual inconsistency is illustrated by a fictive example of experts' ratings of derivatives related to the financial crisis. In order to model gradual inconsistency, we introduce complex certainty factors in Sect. 4 and present their serial and parallel propagation within a rule-based expert system in Sect. 5. In order to propagate belief and disbelief separately, more complex certainty factors for rules are necessary; still, under requirements of local consistency of knowledge, they can be simplified (cp. 5.3).
A simple graphical visualization for expert knowledge acquisition follows. Detecting inconsistencies in experts' argumentations is illustrated by applying our model to the financial crisis example in Subsect. 6.1. Though our ideas for handling the anomaly of gradual inconsistency are designed using the CF model, they are applicable to other formalisms as well. Reasoning with complex certainty factors does not merely sum up the evaluation of a decision in a single figure such as a certainty factor, probability, or likelihood ratio, but can also evaluate distrust and skepticism (cp. 6.2) whenever different argumentations lead to partially conflicting conclusions. In 6.3, we retrospectively interpret the phenomenon of gradual inconsistency, distinguish inherent and apparent inconsistency in the course of uncertainty reasoning, and show techniques to resolve the recognized types of inconsistencies, leading to future work (Sect. 7).

2 Expert knowledge and inconsistency in business applications

Many decision problems in business, economics, society, and politics are based on predictive knowledge and expert evaluations that are prone to hidden conflicts and inconsistencies. They represent partial information on cause-effect relationships and lack the exhaustive frequency or (a priori and conditional) probability data that are a prerequisite for building belief networks. Inference systems should be able to handle experts' opinions as partial and modular knowledge about cause-effect, influence, and relevance relationships, as well as selective association rules extracted by data mining techniques. One application domain lacking complete probability data is the risk evaluation of new technologies, where only (uncertain) expert opinions about causal relationships concerning future consequences are available. Examples are the relationships between the greenhouse effect and global warming, between environmental contamination, damage, and catastrophes, and the effects of extensive use of mobiles and social media on children's mental growth. TV shows with debates between experts of different schools of thought often exhibit how controversial and opposite opinions may lead to inconsistencies in argumentations. Also, knowledge and rules of the same expert may sometimes induce inconsistency within a subject that is hard to discern. In politics, it is not rare to find "experts" who preach democratic principles and human rights but support dictatorships because of hidden economic interests. Such inconsistency cannot be detected easily by TV spectators confronted with experts' opinions on complex problems such as globalization, currency devaluation, political instability, the Middle East conflict, and the Arab Spring.

Furthermore, there are areas such as law and jurisprudence where the knowledge to be applied is normative. Besides informative knowledge, considered descriptive and helping with conceptual understanding, normative knowledge is prescriptive, showing how to comply (with the law). Normative orders and systems are not only relevant in jurisprudence, but also characterize scientific branches like normative economics and normative ethics. Normative knowledge includes requirements, usually expressed using the verbal form "shall", for a necessary conclusion as a (generally formulated) judgment if a given list of conditions is fulfilled. The application of this modular knowledge in court proceedings is subject to uncertainties in the given evidences or facts of the case.
In a criminal case, punishments may differ heavily in extent, from a monetary penalty to several years of prison, depending on the final refined judgment, which may be standard burglary, robbery, or armed robbery. Evidences and facts of a case, like "does the robber have a knife, a pocket knife, etc." and "is it considered a dangerous tool", are subject to uncertainties. In the course of an analysis scheme based on these evidences, the judge should infer belief about intention (negligent, grossly negligent, etc.). On the other hand, he examines exculpations (distress/emergency states), which can lead to disbelief. Thus, given both positive belief and disbelief in some aspects of the judgment, one is confronted with partial inconsistency in the concluding judgment or within an intermediate conclusion. Our method is able to propagate these partial inconsistencies up to the concluding judgment. Only at the concluding judgment does the lawyer have to weigh pros and cons (belief and disbelief) as well as argumentations for and against, in order to finally judge the criminal case. In our opinion, normative knowledge is inherently modular and cannot deliver the conditional probability tables required for belief networks, so using our complex certainty factors for modular knowledge is a method of choice.

3 Certainty factors and the inconsistency anomaly

In introducing certainty factors of facts and rules for modeling uncertain knowledge and discussing their serial and parallel propagation, we stress several general interpretation issues: properties of belief and disbelief, the difference to probability, local consistency, and absolute and gradual belief and inconsistency. A motivating example concerning experts' ratings of derivatives and the financial crisis illustrates the anomaly of gradual inconsistency, which is shown to be improperly handled by the certainty factor model.

3.1 Certainty factors and their relationship to probabilities

A common application of uncertainty reasoning is classification and diagnosis. Observations (symptoms, evidences) can be linked by rules to solutions (hypotheses, diagnoses, diseases). Rules are associated with experts' estimates of confirmation/disconfirmation or belief/disbelief by an (un-)certainty measure, as in a MYCIN example [11]:

IF:   E1) The stain of the organism is gram positive AND
      E2) The morphology of the organism is coccus AND
      E3) The growth conformation of the organism is chains
THEN: there is suggestive evidence (CF = 0.7)
      H) that the identity of the organism is streptococcus

Generally, a certainty factor CF(H,E), denoted here CF(H|E) for convenience, is a real number in [-1, 1] representing a measure of increased belief in the hypothesis H given an acquired evidence E, if it is positive, and a measure of increased disbelief in (belief against) the hypothesis H given the evidence E, if it is negative. While a certainty factor of 1 corresponds to "definitely certain" and -1 to "definitely not" or "certainly against" a hypothesis, certainty factors for the linguistic utterances "weakly suggestive", "suggestive", and "strongly suggestive" evidence may range from 0.2 to 0.95, and those for "almost certainly not", "probably not", and "may be not" from -0.95 to -0.2.
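To make the structure of such a rule explicit, here is one possible plain-data encoding of the MYCIN example rule above in Python. This is purely our illustration; the paper prescribes no particular data format:

```python
# Illustrative encoding of the MYCIN example rule (not from the paper):
strep_rule = {
    "conditions": [  # conjunction E1 AND E2 AND E3
        "the stain of the organism is gram positive",         # E1
        "the morphology of the organism is coccus",           # E2
        "the growth conformation of the organism is chains",  # E3
    ],
    "hypothesis": "the identity of the organism is streptococcus",  # H
    "cf": 0.7,  # "suggestive evidence"
}
```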
A first formula for the certainty factor CF(H|E) of the rule "if E then H", adopted by MYCIN, in terms of a measure of (increased) belief MB(H|E) and a measure of increased disbelief MD(H|E) given an acquired evidence E, is simply the difference:

CF(H|E) = MB(H|E) - MD(H|E)   (1)

Shortliffe and Buchanan [11] note that the above example rule reflects their collaborating expert's belief that gram-positive cocci growing in chains are apt to be streptococci, where a 70% belief in the conclusion is uttered. They note that, translated to the notation of probability, the rule with CF = 0.7 seems to say P(H|E1,E2,E3) = 0.7. The expert, they say, may well agree with this, but he would definitely not agree with the conclusion that P(¬H|E1,E2,E3) = 1 - P(H|E1,E2,E3) = 1 - 0.7 = 0.3. The expert claims that "the three observations are evidence (to degree 0.7) in favor of the conclusion that the organism is a Streptococcus and should not be construed as evidence (to degree 0.3) against Streptococcus". Thus, CF(¬H|E) is not equal to 1 - CF(H|E). Accounting for this difference, Shortliffe and Buchanan [11] fix CF(H|E) = 0 for the case that the hypothesis H is probabilistically independent of the evidence E, that is, for P(H|E) = P(H). In this case, both MB(H|E) and MD(H|E) are equal to zero:

MB(H|E) = 0 and MD(H|E) = 0 for P(H|E) = P(H)   (2)

For the case that the evidence E supports belief in H, P(H|E) > P(H), they define:

MB(H|E) = (P(H|E) - P(H)) / (1 - P(H)) and MD(H|E) = 0 for P(H|E) > P(H)   (3)

By this definition, the measure of increased belief MB(H|E) can be interpreted as the ratio of the increase of probability from P(H) to P(H|E) after acquiring the new evidence E, relative to the possible increase distance from P(H) to 1, full certainty for H. For the case that the evidence E supports disbelief in H (belief against H), P(H|E) < P(H), we get:

MD(H|E) = (P(H) - P(H|E)) / (P(H) - 0) and MB(H|E) = 0 for P(H|E) < P(H)   (4)

Likewise, MD(H|E) can be interpreted as the ratio of the decrease of probability from P(H) to P(H|E) after acquiring E, relative to the distance from 0, full disbelief in H, to P(H). Heckerman [5,7] multiplies the denominators of (3) and (4) by the extra terms P(H|E) - 0 for MB and 1 - P(H|E) for MD, making the definitions symmetric in P(H) and P(H|E) and justifying the parallel propagation formula (15). Further, when P(H) approaches 0 with P(H|E) fixed, MB(H|E) converges to P(H|E) in the original and to 1 in Heckerman's definition. He maps the likelihood ratio λ = P(E|H) / P(E|¬H) ∈ ]0,∞[ to CF ∈ ]-1,1[ by CF = (λ - 1)/λ for λ ≥ 1 and CF = λ - 1 for λ < 1, and applies the Bayesian inversion formulas P(E|H) = P(H|E)·P(E)/P(H) and P(E|¬H) = P(¬H|E)·P(E)/P(¬H). We will not dwell on probabilistic justifications of the CF model, which have already been the subject of many papers.
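As a reading aid, the following minimal Python sketch (our illustration, not code from the paper) computes MB, MD, and CF from a prior P(H) and a posterior P(H|E) according to definitions (2) to (4) and formula (1):

```python
def mb_md_cf(p_h: float, p_h_given_e: float) -> tuple[float, float, float]:
    """Return (MB, MD, CF) for prior P(H) and posterior P(H|E); assumes 0 < P(H) < 1."""
    if p_h_given_e > p_h:                        # E supports H, case (3)
        mb = (p_h_given_e - p_h) / (1.0 - p_h)
        md = 0.0
    elif p_h_given_e < p_h:                      # E supports not-H, case (4)
        mb = 0.0
        md = (p_h - p_h_given_e) / (p_h - 0.0)
    else:                                        # H independent of E, case (2)
        mb = md = 0.0
    return mb, md, mb - md                       # (1): CF = MB - MD

print(mb_md_cf(0.1, 0.7))  # prior 0.1 raised to 0.7 -> MB = 2/3, MD = 0, CF = 2/3
```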
Of concern here are only some desired properties that remain true with these definitions of MB and MD, operationalizing degrees of confirmation and disconfirmation:

• The measure of increased disbelief in H after acquiring evidence E is equal to the measure of increased belief in ¬H after acquiring evidence E, and vice versa:

MD(H|E) = MB(¬H|E)   (5)
MB(H|E) = MD(¬H|E)   (6)

• For each rule, not both the measure of increased belief and the measure of increased disbelief can be positive (local belief consistency):

MB(H|E) > 0 ⟹ MD(H|E) = 0   (7)
MD(H|E) > 0 ⟹ MB(H|E) = 0   (8)

From (5) and (6) it follows, according to CF definition (1), that:

CF(¬H|E) = -CF(H|E)   (9)

Properties (7) and (8), prescribing what we call local belief consistency, are crucial, since the same piece of evidence cannot both favor and disfavor the same hypothesis. Thus, formula (1) is stated for convenience, instead of stating CF(H|E) = MB(H|E) if MB(H|E) > 0 and CF(H|E) = -MD(H|E) if MD(H|E) > 0. As Heckerman [5] states, we assume that probability and belief measures are to be understood as subjective, according to the same expert with prior knowledge k about the domain. So P(H|E) can be seen as P(H|E,k), P(H) as P(H|k), MB(H|E) as MB(H|E,k), MD(H|E) as MD(H|E,k), and CF(H|E) as CF(H|E,k). For a fact E, CF(E) can be seen as a rule's CF: CF(E|k). Precisely, Heckerman denotes CF(H|E,k) as CF(H←E, k) to account for the fact that the expert's knowledge somehow conditions the whole expert's opinion about the CF of the rule, and that a diagnostic rule "if E then H" actually models the reciprocal causality that the hypothesis/disease H causes the appearance of the evidence E.

3.2 Certainty factors of compound evidence and their serial propagation

Given an if-then rule (R) with certainty factor CFR,

(R) if condition/evidence E then conclusion/hypothesis H (CFR)

first compute CF(E) from the CFs of the members constituting the expression E, and then compute CFR(H) for the conclusion by serial propagation of CFs:

1. Calculate CF(E) for E an expression using conjunction, disjunction, and negation:

CF(e1 ∧ e2) = and(CF(e1), CF(e2)) := min(CF(e1), CF(e2))   (10)
CF(e1 ∨ e2) = or(CF(e1), CF(e2)) := max(CF(e1), CF(e2))   (11)
CF(¬e) = -CF(e)   (12)

2. Calculate CFR(H):

If CF(E) > 0 then CFR(H) = CF(E) · CFR   (13)
If CF(E) ≤ 0 then the rule (R) is not applicable   (14)

Whereas the min-function for the conjunction of evidence in (10), as a possible t-norm, is adequate for e1 and e2 being completely or strongly overlapping, another t-norm, CF(e1 ∧ e2) = CF(e1)·CF(e2), less than min(CF(e1), CF(e2)), is more adequate if e1 and e2 are independent. We propose to attach to each rule individual variants of t-norm/t-conorm for computing the CF of a conjunction/disjunction of evidences according to the evidences' grade of overlapping/dependency/disjointedness (see below). It is important to note that serial propagation only applies to the case CF(E) > 0, or, practically, using a threshold, e.g. CF(E) ≥ 0.2 as in MYCIN. Take the rule (R1) "if it rains then the grass gets wet" with certainty factor 0.9. If it rains, we can infer that the grass is wet with certainty factor CF1 = 0.9. Clearly, if it does not rain, CF(Rain) = -1, we cannot infer CF(WetGrass) = -1 · 0.9 = -0.9, since the grass may be wet, for instance, because the sprinkler is on.
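Formulas (10) to (14) can be rendered directly in Python. The following sketch is our illustration under the min/max t-norm/t-conorm choice and MYCIN's practical threshold of 0.2, not code from the paper:

```python
def cf_and(a: float, b: float) -> float:   # (10) conjunction, min t-norm
    return min(a, b)

def cf_or(a: float, b: float) -> float:    # (11) disjunction, max t-conorm
    return max(a, b)

def cf_not(a: float) -> float:             # (12) negation
    return -a

def serial(cf_e: float, cf_rule: float, threshold: float = 0.2) -> float | None:
    """Serial propagation (13)/(14): fire the rule only on positive evidence."""
    if cf_e >= threshold:
        return cf_e * cf_rule              # (13)
    return None                            # (14): rule not applicable

# Rule (R1) "if Rain then WetGrass" with CFR = 0.9:
print(serial(1.0, 0.9))    # rain certain -> WetGrass with CF 0.9
print(serial(-1.0, 0.9))   # no rain -> None: nothing follows about WetGrass
```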
The asymmetry in (13) and (14) accounts for the intuition of experts working with rule-based systems, who commonly state that the presence of evidence E increases belief in a hypothesis H, whereas the absence of E may have no or only negligible significance for H. So for the case CF(Rain) = -1, we have CF(¬Rain) = 1, and this negated evidence is only invoked by a rule with negated evidence such as "if it does not rain, then the grass is not wet", which may be associated with a significantly lower CF, say 0.3, depending on the expert's knowledge of other relevant causes in the domain making grass wet. This CF is nearly 0 if a sensor automatically turns the sprinkler on. Further, knowledge engineering with certainty factors should be either causal or diagnostic in order to avoid strange feedback loops, as with the causal rule (R1) together with the diagnostic rule (R2') "if grass is wet, then the sprinkler is on" with CF2' = 0.4. One could then infer from CF(Rain) = 1 that CF(SprinklerOn) = (1 · 0.9) · 0.4 = 0.36. Clearly, the fact that it rains would "explain away" that the sprinkler is on, so CF(SprinklerOn) should be near zero. While such inter-causal reasoning can be better handled by belief networks, the situation is better modelled by two causal rules or by one compound causal rule using disjunction:

(R12) if Rain ∨ SprinklerOn then WetGrass

For the rule (R12), we propose to attach another t-conorm, such as CF(R ∨ S) = CF(R) + CF(S) - CF(R)·CF(S), greater than max(CF(R), CF(S)) of (11), for R = Rain being independent of S = SprinklerOn, or even CF(R ∨ S) = min(1, CF(R) + CF(S)), assuming that R and S are (almost) mutually exclusive events.

3.3 Parallel CF propagation and belief substantiation of co-concluding rules

The case of parallel propagation of certainty factors applies when two rules have the same conclusion or hypothesis H (two co-concluding rules):

(R1) if E1 then H (CFR1)
(R2) if E2 then H (CFR2)

Let x = CFR1(H) and y = CFR2(H) be the certainty factors for H as calculated by serial propagation of (R1) and (R2); the resulting certainty factor for H is then calculated by:
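The excerpt ends before the combination function itself. For orientation, the standard MYCIN combination function, to which the preceding discussion and the earlier reference to formula (15) point, can be sketched as follows. This is our own Python rendering of the well-known formula, not code or notation from the paper:

```python
def parallel(x: float, y: float) -> float:
    """Standard MYCIN combination of two CFs x, y derived for the same hypothesis H."""
    if x >= 0 and y >= 0:          # two measures of increased belief substantiate
        return x + y - x * y
    if x <= 0 and y <= 0:          # two measures of increased disbelief substantiate
        return x + y + x * y
    # mixed belief/disbelief; undefined for absolute inconsistency |x| = |y| = 1
    return (x + y) / (1.0 - min(abs(x), abs(y)))

print(parallel(0.6, 0.4))    # 0.76: parallel belief substantiation
print(parallel(0.7, -0.4))   # 0.5: belief outweighs disbelief
print(parallel(0.4, -0.4))   # 0.0: belief and disbelief cancel out
```

As the paper stresses, the mixed case collapses conflicting belief and disbelief into a single number and is undefined for the absolutely inconsistent input x = 1, y = -1 (division by zero); this is exactly the deficiency that the complex certainty factors of Sect. 4 are meant to address.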



Publication date: 2014